Nvidia Llama 3.1 Nemotron 70B Instruct HF AWQ INT4
This is an AWQ 4-bit (INT4) quantized version of NVIDIA's Llama-3.1-Nemotron-70B-Instruct model. NVIDIA customized the base model from Meta's Llama-3.1-70B-Instruct to improve the helpfulness of generated responses.
Large Language Model
Transformers, Supports Multiple Languages
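
To give a rough sense of what 4-bit quantization means for a model of this size, the sketch below estimates weight storage at 16-bit and at 4-bit precision. It is back-of-the-envelope arithmetic, not a measurement of this checkpoint: the nominal 70e9 parameter count, the group size of 128, and the per-group overhead (one 16-bit scale plus one 4-bit zero point) are assumptions chosen for illustration, and in practice some layers (for example embeddings) are often left unquantized, so the real file size will differ.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: nominal "70B" parameter count; the exact count differs slightly.
NUM_PARAMS = 70e9

# Assumption: group-wise quantization with 128 weights per group, each group
# storing one 16-bit scale and one 4-bit zero point alongside the 4-bit weights.
GROUP_SIZE = 128
overhead_bits_per_weight = (16 + 4) / GROUP_SIZE

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)
int4_gb = weight_memory_gb(NUM_PARAMS, 4)
int4_grouped_gb = weight_memory_gb(NUM_PARAMS, 4 + overhead_bits_per_weight)

print(f"16-bit weights:               {fp16_gb:.1f} GB")
print(f"4-bit weights (no overhead):  {int4_gb:.1f} GB")
print(f"4-bit weights (with groups):  {int4_grouped_gb:.1f} GB")
```

Under these assumptions the weights shrink to roughly a quarter of their 16-bit size. Activations and the KV cache are not included in this estimate.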